Skip to content

fix(cypher,store): prevent crashes from buffer overflow, OOM, and NULL stmts#206

Open
map588 wants to merge 1 commit intoDeusData:mainfrom
map588:fix/crash-safety
Open

fix(cypher,store): prevent crashes from buffer overflow, OOM, and NULL stmts#206
map588 wants to merge 1 commit intoDeusData:mainfrom
map588:fix/crash-safety

Conversation

@map588
Copy link
Copy Markdown
Contributor

@map588 map588 commented Apr 5, 2026

  • cypher: Add bounds check in lex_string_literal to prevent stack buffer overflow on string literals >4096 bytes
  • cypher: Add malloc/calloc NULL checks in parse_props, parse_rel_types, parse_in_condition, and parse_case_expr to prevent OOM crashes
  • store: Add sqlite3_prepare_v2 return code checks at 3 sites in cbm_store_schema_info and collect_pkg_names to prevent NULL stmt dereference on DB corruption

Copilot AI review requested due to automatic review settings April 5, 2026 01:00
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the Cypher lexer/parser and the SQLite-backed store layer against crash scenarios triggered by unusually large inputs, out-of-memory conditions, and SQLite statement preparation failures (e.g., due to DB corruption).

Changes:

  • Cypher lexer: add a bounds check in lex_string_literal and a regression test for oversized string literals.
  • Cypher parser: add initial malloc/calloc NULL checks in several parse helpers to reduce OOM crash risk.
  • Store: add sqlite3_prepare_v2 return-code/NULL-stmt checks at three call sites to avoid NULL sqlite3_stmt* dereferences.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
src/cypher/cypher.c Adds string-literal truncation logic and some initial allocation NULL checks in parsing helpers.
src/store/store.c Adds SQLite prepare checks to avoid dereferencing NULL statements during schema/package collection.
tests/test_cypher.c Adds a regression test to ensure oversized string literals don’t overflow lexer buffers.
Comments suppressed due to low confidence (2)

src/store/store.c:2564

  • arr is allocated without a NULL check. On OOM this will be NULL and the arr[n]... writes in the row loop will crash; safe_realloc can also return NULL later in the loop. Since this PR is explicitly hardening OOM cases, consider adding allocation checks and returning an error (finalizing stmt first) if allocation fails.
        int cap = ST_INIT_CAP_8;
        int n = 0;
        cbm_label_count_t *arr = malloc(cap * sizeof(cbm_label_count_t));
        while (sqlite3_step(stmt) == SQLITE_ROW) {
            if (n >= cap) {

src/store/store.c:2591

  • arr is allocated without a NULL check. On OOM this will be NULL and the arr[n]... writes in the row loop will crash; safe_realloc can also return NULL later in the loop. Please add allocation-result checks and handle failure (including finalizing stmt) to fully prevent OOM crashes.
        int cap = ST_INIT_CAP_8;
        int n = 0;
        cbm_type_count_t *arr = malloc(cap * sizeof(cbm_type_count_t));
        while (sqlite3_step(stmt) == SQLITE_ROW) {
            if (n >= cap) {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

int blen = 0;
const int max_blen = CBM_SZ_4K - 1;
while (*pos < len && input[*pos] != quote) {
if (blen >= max_blen) { (*pos)++; continue; }
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncation path (if (blen >= max_blen) { (*pos)++; continue; }) skips escape-handling, so once the buffer is full a sequence like \" can be mis-tokenized: the backslash is skipped but the following quote is then seen as the closing delimiter, ending the string early. Consider still consuming escape sequences while truncating (advance past both '\' and the escaped char) so the lexer always finds the real closing quote.

Suggested change
if (blen >= max_blen) { (*pos)++; continue; }
if (blen >= max_blen) {
if (input[*pos] == '\\' && *pos + SKIP_ONE < len) {
*pos += SKIP_ONE;
}
(*pos)++;
continue;
}

Copilot uses AI. Check for mistakes.
Comment on lines +2582 to +2584
if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the edge-types sqlite3_prepare_v2 fails, this returns immediately without freeing any node-label data already stored in out from the previous block. Some callers only call cbm_store_schema_free on CBM_STORE_OK (e.g., UI layout3d), so this can leak memory. Consider using a single cleanup path that frees any partially-filled out before returning an error.

Copilot uses AI. Check for mistakes.
Comment on lines 471 to +476
int cap = CYP_INIT_CAP4;
int n = 0;
cbm_prop_filter_t *arr = malloc(cap * sizeof(cbm_prop_filter_t));
if (!arr) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds an initial malloc NULL-check, but the function still calls safe_realloc and heap_strdup later without checking for NULL. Since safe_realloc returns NULL (and frees the old pointer), an OOM during growth or key/value duplication can still lead to NULL dereferences (e.g., later strcmp on a NULL props[i].value). Consider propagating an error if any allocation/duplication fails and freeing partially-built arr.

Copilot uses AI. Check for mistakes.
Comment on lines 574 to +579
int cap = CYP_INIT_CAP4;
int n = 0;
const char **types = malloc(cap * sizeof(const char *));
if (!types) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds an initial malloc NULL-check, but types can still become NULL if a later safe_realloc fails (it returns NULL and frees the old buffer), and heap_strdup can return NULL as well. To fully prevent OOM crashes, please check realloc/duplication results, free any already-duplicated strings on failure, and return an error.

Copilot uses AI. Check for mistakes.
Comment on lines 770 to +777
int vcap = CYP_INIT_CAP8;
int vn = 0;
const char **vals = malloc(vcap * sizeof(const char *));
if (!vals) {
free((void *)c->variable);
free((void *)c->property);
free((void *)c->op);
return NULL;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to check the initial malloc, but the list-growth path later uses safe_realloc without checking for NULL (it frees the old pointer on failure), and each element is populated via heap_strdup which may return NULL. Either can still produce a NULL dereference later when evaluating IN conditions (e.g., strcmp(actual, c->in_values[i])). Please add checks and clean up partial allocations on failure.

Copilot uses AI. Check for mistakes.
Comment on lines 1077 to +1086
cbm_case_expr_t *kase = calloc(CBM_ALLOC_ONE, sizeof(cbm_case_expr_t));
if (!kase) {
return NULL;
}
int bcap = CYP_INIT_CAP4;
kase->branches = malloc(bcap * sizeof(cbm_case_branch_t));
if (!kase->branches) {
free(kase);
return NULL;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks calloc/malloc for the initial CASE expression allocations, but the branch-growth path later uses safe_realloc without checking for NULL (and safe_realloc frees the old buffer on failure). To avoid NULL dereferences/leaks on OOM, please check the realloc result and free kase/branches before returning an error.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 497 to +502
if (n >= cap) {
cap *= PAIR_LEN;
arr = safe_realloc(arr, cap * sizeof(cbm_prop_filter_t));
if (!arr) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safe_realloc frees the original buffer on failure (see platform.h), so if this growth realloc fails you return with arr already freed but any previously heap_strdup'd keys/values are leaked (and you can't clean them up anymore). Use a non-destructive realloc pattern (tmp=realloc; on failure, free arr[i].key/value for i<n, then free(arr)) so OOM handling is leak-free.

Copilot uses AI. Check for mistakes.
Comment on lines 603 to +608
if (n >= cap) {
cap *= PAIR_LEN;
types = safe_realloc(types, cap * sizeof(const char *));
if (!types) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same safe_realloc issue as in parse_props: on realloc failure the types array is freed, but already-duplicated strings in types[0..n-1] leak. Prefer realloc into a temporary pointer so you can free the existing types[i] strings and the array on failure.

Copilot uses AI. Check for mistakes.
Comment on lines +794 to +801
vcap *= PAIR_LEN;
vals = safe_realloc(vals, vcap * sizeof(const char *));
if (!vals) {
free((void *)c->variable);
free((void *)c->property);
free((void *)c->op);
return NULL;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On realloc failure, safe_realloc frees the existing vals array, so returning here leaks all heap_strdup'd values already stored in vals[0..vn-1]. Use realloc into a temp pointer (or a realloc helper that does not free on failure) so you can free the accumulated vals[i] strings + vals before returning.

Suggested change
vcap *= PAIR_LEN;
vals = safe_realloc(vals, vcap * sizeof(const char *));
if (!vals) {
free((void *)c->variable);
free((void *)c->property);
free((void *)c->op);
return NULL;
}
const char **new_vals;
int i;
vcap *= PAIR_LEN;
new_vals = realloc((void *)vals, vcap * sizeof(const char *));
if (!new_vals) {
for (i = 0; i < vn; i++) {
free((void *)vals[i]);
}
free((void *)vals);
free((void *)c->variable);
free((void *)c->property);
free((void *)c->op);
return NULL;
}
vals = new_vals;

Copilot uses AI. Check for mistakes.
Comment on lines 1111 to +1118
if (kase->branch_count >= bcap) {
bcap *= PAIR_LEN;
kase->branches = safe_realloc(kase->branches, bcap * sizeof(cbm_case_branch_t));
if (!kase->branches) {
expr_free(when);
free(kase);
return NULL;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On branch growth, safe_realloc frees the existing kase->branches array on failure; returning here then leaks all previously parsed when_expr trees and then_val strings already stored in the old branches (since you no longer have the array to iterate). Switch to a non-destructive realloc pattern and, on failure, call free_case_expr(kase) (or equivalent partial cleanup) before returning.

Copilot uses AI. Check for mistakes.
Comment on lines +2555 to +2557
if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If sqlite3_prepare_v2 fails it can still leave a non-NULL statement handle in some cases (e.g., partial compile). Returning without finalizing risks leaking stmt. Consider capturing rc, and if stmt is non-NULL always sqlite3_finalize(stmt) before returning (and optionally record the SQLite error).

Copilot uses AI. Check for mistakes.
Comment on lines +2582 to +2585
if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
cbm_store_schema_free(out);
return CBM_NOT_FOUND;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: on prepare failure, ensure any non-NULL stmt is finalized before returning. Right now this path can leak stmt (and you already free out, which is good).

Copilot uses AI. Check for mistakes.
int max_pkgs) {
sqlite3_stmt *stmt = NULL;
sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL);
if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On prepare failure, consider finalizing stmt if it is non-NULL before returning to avoid leaking a partially prepared statement. Also consider recording the underlying SQLite error so callers have some diagnostics when DB corruption triggers this path.

Suggested change
if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
int rc = sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL);
if (rc != SQLITE_OK || !stmt) {
if (stmt != NULL) {
sqlite3_finalize(stmt);
}

Copilot uses AI. Check for mistakes.
…L stmts

- cypher: Add bounds check in lex_string_literal to prevent stack buffer
  overflow on string literals >4096 bytes. Escape sequences are always
  parsed correctly even past the truncation boundary.
- cypher: Add malloc/calloc NULL checks in parse_props, parse_rel_types,
  parse_in_condition, and parse_case_expr. Growth paths use non-destructive
  realloc (temp pointer) so accumulated elements can be freed on OOM
  instead of leaking through safe_realloc's free-on-failure semantics.
- store: Add sqlite3_prepare_v2 return code checks at 3 sites in
  cbm_store_schema_info and collect_pkg_names. Partially prepared
  statements are finalized before returning. Schema function cleans up
  partially populated output on failure. collect_pkg_names returns
  CBM_NOT_FOUND (not 0) to distinguish errors from empty results.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +508 to 512
arr = tmp;
cap = new_cap;
}
arr[n].key = heap_strdup(key->text);
arr[n].value = heap_strdup(val->text);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_props assigns arr[n].key/value from heap_strdup without checking for NULL. If heap_strdup fails, later code like check_inline_props() will call strcmp(props[i].value, ...) and can segfault. After heap_strdup, validate the returned pointers; on failure, free any previously allocated entries (and arr itself) and return an error.

Copilot uses AI. Check for mistakes.
Comment on lines +620 to 624
types = (const char **)tmp;
cap = new_cap;
}
types[n++] = heap_strdup(t->text);
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_rel_types stores relationship type strings via heap_strdup but never checks for NULL. If an allocation fails, rel->types will contain NULL and later strcmp/SQLite binding paths can dereference it. After heap_strdup, check for NULL and, if needed, free any previously duplicated types and return an error.

Copilot uses AI. Check for mistakes.
Comment on lines +818 to 823
}
vals = (const char **)tmp;
vcap = new_vcap;
}
vals[vn++] = heap_strdup(advance(p)->text);
} else {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_in_list appends list items via heap_strdup without checking for NULL. eval_condition later does strcmp(actual, c->in_values[i]) and will crash if any entry is NULL. Check heap_strdup’s return; on failure, free vals (including already-duplicated strings) and c’s allocated fields, then return NULL with an error set.

Copilot uses AI. Check for mistakes.
if (!tmp) {
expr_free(when);
for (int i = 0; i < kase->branch_count; i++) {
expr_free(kase->branches[i].when_expr);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On realloc failure, the cleanup loop frees only when_expr for existing branches but not branches[i].then_val. That leaks the already-duplicated THEN values (and can be substantial for large queries) right in the OOM path this change is trying to harden. Free then_val for each existing branch in the failure cleanup before freeing branches/kase.

Suggested change
expr_free(kase->branches[i].when_expr);
expr_free(kase->branches[i].when_expr);
free((void *)kase->branches[i].then_val);

Copilot uses AI. Check for mistakes.
if (stmt) {
sqlite3_finalize(stmt);
}
return CBM_NOT_FOUND;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collect_pkg_names returns CBM_NOT_FOUND on sqlite3_prepare_v2 failure, which is the same numeric value as CBM_STORE_ERR (-1) but semantically inconsistent with other store helpers that return CBM_STORE_ERR and call store_set_error_sqlite(). Consider returning CBM_STORE_ERR (or 0 if you intend “no pkgs”) and setting the store error so downstream callers can diagnose DB corruption.

Suggested change
return CBM_NOT_FOUND;
store_set_error_sqlite(s, "sqlite3_prepare_v2 failed in collect_pkg_names");
return CBM_STORE_ERR;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants